Abstract

The variation of a martingale $p_0^k = p_0, \ldots, p_k$ of probabilities on a
finite (or countable) set $X$ is denoted $V(p_0^k)$ and defined by
$V(p_0^k) = E\left(\sum_{t=1}^k \|p_t - p_{t-1}\|_1\right)$. It is shown that
$V(p_0^k) \le \sqrt{2kH(p_0)}$, where $H(p)$ is the entropy function
$H(p) = -\sum_x p(x)\log p(x)$ and log stands for the natural logarithm.
Therefore, if $d$ is the number of elements of $X$, then
$V(p_0^k) \le \sqrt{2k\log d}$. It is shown that the order of magnitude
of the bound $\sqrt{2k\log d}$ is tight for $d \le 2^k$: there is $C > 0$ such
that for every $k$ and $d \le 2^k$ there is a martingale $p_0^k = p_0, \ldots, p_k$ of
probabilities on a set $X$ with $d$ elements, and with variation
$V(p_0^k) \ge C\sqrt{2k\log d}$. An application of the first result to game theory is that
the difference between $v_k$ and $\lim_j v_j$, where $v_k$ is the value of the
$k$-stage repeated game with incomplete information on one side with
$d$ states, is bounded by $\|G\|\sqrt{2k^{-1}\log d}$ (where $\|G\|$ is the maximal
absolute value of a stage payoff). Furthermore, it is shown that the
order of magnitude of this game theory bound is tight.

Keywords: Maximal martingale variation; posteriors variation; repeated
games with incomplete information

2000 Mathematics Subject Classification: Primary 60G42, Secondary 91A20

* Institute of Mathematics, and Center for the Study of Rationality, The Hebrew
University of Jerusalem, Givat Ram, Jerusalem 91904, Israel. This research was
supported in part by Israel Science Foundation grants 1123/06 and 1596/10.






1 Introduction 



Bounds on the variation of a martingale of probabilities are useful in the
theory of repeated games with incomplete information. Such martingales arise as
sequences of an uninformed player's posteriors $p_0^k = p_0, \ldots, p_k$ of an unknown
game parameter. The martingale's variation, $V(p_0^k) := E\sum_{t=1}^k \|p_t - p_{t-1}\|_1$,
bounds from above (a positive constant times) the payoff advantage that the
more informed player has over the less informed one in a two-person zero-sum
$k$-stage repeated game with incomplete information on one side; see
[H El HIE].

The maximal variation of a martingale $p_0^k$ of probabilities over a finite set
depends both on the initial probability $p = p_0$ and on $k$. It is bounded by
a positive constant $C(p)$ times the square root of $k$. This inequality is used
in Aumann and Maschler [1] (see footnote 1) to prove that the speed of convergence
of the minmax value $v_k$ of the $k$-repeated game with incomplete information on one
side and perfect monitoring is $O(1/\sqrt{k})$. Zamir [5] proved the tightness of
this bound: there is a repeated game with incomplete information on one
side and perfect monitoring for which the error term, $v_k - \lim_n v_n$, is greater
than or equal to $1/\sqrt{k}$.

Mertens and Zamir [3] showed that $C(p)$ is less than or equal to $\sqrt{d-1}$,
where $d$ is the number of elements in the support of $p$, and the error term is
less than or equal to $\|G\|\sqrt{d-1}/\sqrt{k}$, where $\|G\|$ is the maximal absolute
value of a payoff in one of the possible $d$ single-stage games.

The objective of the present paper is to improve the order of magnitude
of the term $\sqrt{d}$ in the above-mentioned bounds. The main result of the
paper is that $V(p_0^k) \le \sqrt{2kH(p)} \le \sqrt{2k\log d}$, where $H$ is the entropy
function. This inequality implies that the error term is less than or equal to
$\|G\|\sqrt{2H(p)}/\sqrt{k}$, which is less than or equal to $\|G\|\sqrt{2\log d}/\sqrt{k}$.

We also provide tightness results for both the variation of a martingale
of probabilities and the error term in repeated games with incomplete
information on one side: there exists a positive constant $C$ such that for all
positive integers $k$ and $d$ with $1 < d \le 2^k$ there is (1) a martingale of
probabilities on a finite set with $d$ elements, $p_0^k = p_0, \ldots, p_k$, with variation
greater than $C\sqrt{k\log d}$, and (2) a repeated game with incomplete information on
one side with an error term that is greater than $C\|G\|\sqrt{\log d}/\sqrt{k}$.

1 This book is based on reports by Robert J. Aumann and Michael Maschler which
appeared in the sixties in Report of the U.S. Arms Control and Disarmament Agency.
See "Game theoretic aspects of gradual disarmament" (1966, ST-80, Chapter V, pp. V1-V55),
"Repeated games with incomplete information: a survey of recent results" (1967,
ST-116, Chapter III, pp. 287-403), and "Repeated games with incomplete information:
the zero-sum extensive case" (1968, ST-143, Chapter III, pp. 37-116).

2 The results 

Let $X$ be a finite (or countable) set. For $x \in X$, the $x$-th coordinate of
an element $q \in \mathbb{R}^X$ is denoted $q(x)$, and $\ell_1(X)$ is the (Banach) space of all
elements $q \in \mathbb{R}^X$ with $\sum_{x \in X} |q(x)| < \infty$. Obviously, if $X$ is a finite set, then
$\ell_1(X)$ equals $\mathbb{R}^X$. The $\ell_1$ norm of $q \in \ell_1(X)$ is $\|q\|_1 := \sum_{x \in X} |q(x)|$, and
(thus) the $\ell_1$ distance between two elements $p, q \in \ell_1(X)$ is the sum
$\|p - q\|_1 = \sum_{x \in X} |p(x) - q(x)|$. A $k$-step $\ell_1(X)$-valued martingale is a stochastic
process $p_0^k = p_0, \ldots, p_k$ where $p_t$, $0 \le t \le k$, takes values in $\ell_1(X)$ and
$E(p_t \mid p_0, \ldots, p_{t-1}) = p_{t-1}$. Let $\Delta(X)$ denote all probabilities on $X$, i.e.,
all nonnegative elements $p \in \mathbb{R}^X$ with $\sum_{x \in X} p(x) = 1$, and for $p \in \Delta(X)$ and a positive
integer $k$ we denote by $\mathcal{M}_k(X, p)$ the set of all martingales $p_0^k$ with $p_t \in \Delta(X)$
and $p_0 = p$.

The variation of the martingale $p_0^k$ is denoted $V(p_0^k)$ and is defined by
$V(p_0^k) = E\left(\sum_{t=1}^k \|p_t - p_{t-1}\|_1\right)$. Set

$$V(k,p) := \sup\{V(p_0^k) : p_0^k \in \mathcal{M}_k(X,p)\} \tag{1}$$

and

$$V(k,d) := \sup\{V(k,p) : p \in \Delta(X) \text{ and } |X| = d\}. \tag{2}$$

A trivial inequality is $V(k,p) \le 2k$. A classical bound (that is used in the
theory of repeated games with incomplete information; see [1, 3]) of $V(k,d)$
is

$$V(k,d) \le \sqrt{k(d-1)}.$$

This classical bound improves the trivial bound only for $d \le 4k$. Our objective
is to derive a meaningful bound that (1) is applicable also to $d > 4k$,
and (2) has the best possible order of magnitude for large $d$.
We have

Theorem 1

$$V(k,p) \le \sqrt{2kH(p)}$$

and thus

$$V(k,d) \le \sqrt{2k\log d},$$

where $H(p) = -\sum_x p(x)\log p(x)$ is the entropy function and log stands for
the natural logarithm.

As $V(k,p) \le 2k$, the results of Theorem 1 are of interest for $H(p) < 2k$
and for $d < e^{2k}$. For large values of $d < e^{2k}$, the bound $\sqrt{2k\log d}$ is a
significant improvement over the classical bound $\sqrt{k(d-1)}$. Moreover, as
there are probabilities $p$ over a countable set $X$ with finite entropy, the bound
$\sqrt{2kH(p)}$ is applicable independently of the size of the set $X$.
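To make the comparison between the three bounds concrete, the following Python sketch evaluates the trivial bound $2k$, the classical bound $\sqrt{k(d-1)}$, and the entropy bounds of Theorem 1. It is only a numerical illustration; the function names are ours and do not appear in the paper.

```python
import math


def trivial_bound(k: int) -> float:
    """Trivial bound V(k, p) <= 2k (each l1 increment is at most 2)."""
    return 2.0 * k


def classical_bound(k: int, d: int) -> float:
    """Classical bound V(k, d) <= sqrt(k (d - 1))."""
    return math.sqrt(k * (d - 1))


def entropy(p) -> float:
    """Entropy H(p) = -sum_x p(x) log p(x), natural logarithm, 0 log 0 = 0."""
    return -sum(q * math.log(q) for q in p if q > 0)


def entropy_bound_from_p(k: int, p) -> float:
    """Bound of Theorem 1 in terms of the prior: sqrt(2 k H(p))."""
    return math.sqrt(2.0 * k * entropy(p))


def entropy_bound(k: int, d: int) -> float:
    """Bound of Theorem 1 in terms of the number of states: sqrt(2 k log d)."""
    return math.sqrt(2.0 * k * math.log(d))
```

For instance, with $k = 100$ and $d = 10^6$ the classical bound exceeds the trivial bound $2k = 200$, whereas `entropy_bound(100, 10**6)` is well below it. For very small $d$ (e.g., $d = 2$) the classical bound can be the smaller of the two, which is consistent with the statement that the improvement concerns large $d$.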

One may wonder if the order of magnitude of each of the bounds, $\sqrt{2k\log d}$
and $\sqrt{2kH(p)}$, is the best possible. For $X = \{0,1\}$ and $p(\alpha) = (\alpha, 1-\alpha) \in \Delta(X)$
we have $V(k,p(\alpha)) \le 2\sqrt{k\alpha(1-\alpha)}$. As $\alpha(1-\alpha) = o(H(p(\alpha)))$ as
$\alpha \to 0+$, the order of magnitude of the bound $\sqrt{kH(p)}$ is not tight. The
next result demonstrates the tightness of the order of magnitude of the bound
$\sqrt{2k\log d}$ for large values of $d$ (see footnote 2). We have

Theorem 2 There is a positive constant $C > 0$ such that for every $k$ and $d$
with $d \le 2^k$ there is $p_0^k \in \mathcal{M}_k(X, p_0)$ with $|X| = d$ such that

$$V(p_0^k) \ge C\sqrt{k\log d}.$$

Bounds of the variation of martingales of probabilities are useful in the
study of repeated games with incomplete information [1]. In a two-person
zero-sum repeated game with incomplete information on one side (henceforth,
RGII-OS) the players play repeatedly the same stage game $G$. However, the
game depends on a state $x \in X$ known only to player 1 (P1) and $x$ is chosen
according to a probability $p \in \Delta(X)$ that is commonly known. In the course
of the game player 2 (P2) may learn information about $x$ only from past
actions of player 1.

Formally, a RGII-OS $\Gamma$ is defined by a state space $X$, a probability
$p \in \Delta(X)$, finite sets of stage actions, $I$ for P1 and $J$ for P2, and for every
$x \in X$ we have a two-person zero-sum $I \times J$ matrix game $G^x$. We write
$\Gamma = (X, p, I, J, G)$, where $G$ stands for the list of matrix games $(G^x)_{x \in X}$.

The $(i,j)$-th entry of $G^x$, denoted $G^x_{i,j}$, is the payoff from P2 to P1 when in
state $x$ the players play the action pair $(i,j)$.

2 I wish to thank Benjamin Weiss for raising the question of the tightness of the factor
$\sqrt{\log d}$ in the bound, and demonstrating for each positive $\ell$ the existence of a simple
martingale of probabilities $p_0^\ell$ on a set with $2^\ell$ elements and with variation $\ell$. Specifically,
starting with the uniform probability, in each stage half of the non-zero probabilities (each
half equally likely) move to zero, and the other half double their probabilities. Therefore,
for each fixed $\alpha > 0$ there is a positive constant $0 < C(\alpha)$ ($\to_{\alpha \to 0+} 0$) such that for $k$ and
$d$ with $\alpha \le \frac{\log_2 d}{k} \le 1$, $V(k,d) \ge [\log_2 d] \ge C(\alpha)\sqrt{k\log d}$.
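The halving-and-doubling martingale described in footnote 2 is simple enough to simulate exactly. The sketch below is our own illustration (function names are hypothetical): it draws one sample path on a set with $2^\ell$ elements, using exact rational arithmetic, and computes its $\ell_1$ variation. Every step moves mass $1/2$ to zero and doubles the rest, so each increment has $\ell_1$ norm exactly 1.

```python
import math
import random
from fractions import Fraction


def halving_martingale_path(ell: int, rng: random.Random):
    """One sample path p_0, ..., p_ell on X = {0, ..., 2**ell - 1}.

    At each stage the current support is split into two halves; one half,
    chosen with probability 1/2, drops to zero and the other half doubles.
    """
    d = 2 ** ell
    support = list(range(d))
    path = [[Fraction(1, d)] * d]          # p_0 is uniform
    for _ in range(ell):
        half = len(support) // 2
        # each half is equally likely to survive, which makes p_t a martingale
        support = support[:half] if rng.random() < 0.5 else support[half:]
        alive = set(support)
        path.append([Fraction(1, len(support)) if x in alive else Fraction(0)
                     for x in range(d)])
    return path


def l1_variation(path) -> Fraction:
    """Sum over t of ||p_t - p_{t-1}||_1 along one path."""
    return sum(sum(abs(a - b) for a, b in zip(p, q))
               for p, q in zip(path[1:], path[:-1]))
```

Since the variation equals $\ell$ on every path, its expectation is $\ell$ as well, which can be compared with the bound $\sqrt{2k\log d}$ of Theorem 1 for $k = \ell$ and $d = 2^\ell$.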

The $k$-stage repeated game, denoted $\Gamma_k(p)$, or $\Gamma_k$ for short, is played
as follows. Nature chooses $x \in X$ according to the probability $p$. P1 is
informed of nature's choice $x$, but P2 is not. At stage $1 \le t \le k$, P1 chooses
$i_t \in I$ and simultaneously P2 chooses $j_t \in J$ (and these choices are observed
by the players following the play in stage $t$). The choice of $i_t$ may depend
on $x, i_1, j_1, \ldots, i_{t-1}, j_{t-1}$ (which is the information of P1 before the play at
stage $t$) and the choice of $j_t$ may depend on $i_1, j_1, \ldots, i_{t-1}, j_{t-1}$ (which is the
information of P2 before the play at stage $t$).

A pair of strategies $\sigma$ of P1 and $\tau$ of P2 (together with the initial
probability $p$) define a probability distribution $P^p_{\sigma,\tau}$, or $P_{\sigma,\tau}$ for short, on the space
of plays $x, i_1, j_1, \ldots, i_k, j_k$, and thus on the stream of payoffs $g_t := G^x_{i_t,j_t}$.
The (normalized) payoff of the $k$-stage repeated game is the average of the
payoffs in the $k$ stages of the game, namely, $\bar g_k = \frac{1}{k}\sum_{t=1}^k g_t$. The minmax
value of $\Gamma_k(p)$ is $v_k(p) := \max_\sigma \min_\tau E_{\sigma,\tau}\,\bar g_k$, where $E_{\sigma,\tau}$ stands for the
expectation with respect to the probability $P^p_{\sigma,\tau}$, the maximum is over all mixed
(or behavioral) strategies $\sigma$ of P1, and the minimum is over all mixed (or
behavioral) strategies $\tau$ of P2.

For fixed components $(X, I, J, G)$, the minmax value of the matrix game
$\sum_x q(x)G^x$ is a function of $q \in \Delta(X)$ and is denoted $u(q)$. The least
concave function on $\Delta(X)$ that is greater than or equal to $u$ ("smallest
concave majorant") is denoted $\mathrm{cav}\,u$. Aumann and Maschler [1] proved that
$v_k(p) \ge (\mathrm{cav}\,u)(p)$ and that $v_k(p)$ converges to $(\mathrm{cav}\,u)(p)$ as $k \to \infty$.
Moreover, [1] shows that the bound of the variation of the martingale of
probabilities bounds the (nonnegative) difference $v_k(p) - (\mathrm{cav}\,u)(p)$. Explicitly, if
$\|G\| := \max_{x,i,j} |G^x_{i,j}|$, we have

$$v_k(p) - (\mathrm{cav}\,u)(p) \le \|G\|V(k,p)/k. \tag{3}$$

Inequality (3) yields on the one hand a rate of convergence of $v_k(p)$, and on
the other hand enables us to approximate the value $v_k(p)$ for a specific $k$ and
a specific game. The classical bound of $V(k,p)$ that is used in [1, 3] and in
subsequent works is

$$V(k,p) \le V(k,d) \le \sqrt{k(d-1)}.$$

For $d > k$ this bound is not useful. Theorem 1 provides an effective bound
when $d$ is subexponential in $k$, namely, when $\log d = o(k)$, or, more generally,
when $H(p)/k$ is small. Applying the bound in Theorem 1 to the inequality
(3) implies that

$$v_k(p) - (\mathrm{cav}\,u)(p) \le \|G\|\sqrt{\frac{2H(p)}{k}}. \tag{4}$$

One may wonder if the order of magnitude of the bound in (4) is tight.
We have

Theorem 3 There is a positive constant $C$ such that for every $k$ and $d$ with
$d \le 2^k$ there is a repeated game $\Gamma$ with incomplete information on one side
with ($\|G\| > 0$ and) $d$ states such that

$$v_k(p) - (\mathrm{cav}\,u)(p) \ge C\|G\|V(k,d)/k. \tag{5}$$



3 Proofs 

Proof of Theorem 1. Let $p_0^k$ be $(\mathcal{H}_t)$-adapted; that is, $p_t$ is measurable with
respect to the $\sigma$-algebra $\mathcal{H}_t \subset \mathcal{H}_{t+1}$. Without loss of generality we can assume
that $\mathcal{H}_t$ are finite, namely, algebras. (Indeed, if $p_t$ is measurable with respect
to the $\sigma$-algebra $\mathcal{H}_t \subset \mathcal{H}_{t+1}$, one replaces $\mathcal{H}_t$ with an algebra $\hat{\mathcal{H}}_t \subset \hat{\mathcal{H}}_{t+1}$
such that $\|E(p_t \mid \hat{\mathcal{H}}_t) - p_t\| \le \varepsilon/k$, and replaces $p_t$ with $\hat p_t := E(p_t \mid \hat{\mathcal{H}}_t)$
($= E(p_k \mid \hat{\mathcal{H}}_t)$). Note that $\sum_{t=1}^k \|\hat p_t - \hat p_{t-1}\| + 2\varepsilon \ge \sum_{t=1}^k \|p_t - p_{t-1}\|$.)
In that case we can assume that: (1) $P$ is a probability on the product
$X \times (\times_{t=0}^k A_t)$, where $A_t$ are finite sets (e.g., the atoms of the algebra $\mathcal{H}_t$);
(2) $(x, a_0, a_1, \ldots, a_k)$ is a vector of random variables having distribution $P$;
and (3) $p_t$ is the conditional distribution of $x$ given $a_0, \ldots, a_t$. Let $P_t$ be the
conditional (joint) distribution of $(x, a_t)$ given $(a_0, \ldots, a_{t-1})$, $P_{tX}$ its marginal
on $X$, and $P_{tA_t}$ its marginal on $A_t$. Let $P_{tX} \otimes P_{tA_t}$ denote the product
distribution on $X \times A_t$, i.e., $P_{tX} \otimes P_{tA_t}(x, a_t) = P_{tX}(x)P_{tA_t}(a_t)$. By Pinsker's
inequality (see, e.g., [2, p. 300]), we have

$$\|P_t - P_{tX} \otimes P_{tA_t}\| \le \sqrt{2}\sqrt{D(P_t \| P_{tX} \otimes P_{tA_t})},$$

where for two probabilities $P$ and $Q$ on a finite (or countable) set $Y$,
$\|P - Q\| = \sum_{y \in Y} |P(y) - Q(y)|$ and $D(P\|Q) = \sum_{y \in Y} P(y)\log\frac{P(y)}{Q(y)}$ (where log
denotes the natural logarithm and $0\log 0 = 0$).
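Pinsker's inequality in the form used here, $\|P - Q\| \le \sqrt{2D(P\|Q)}$ with the $\ell_1$ distance and natural logarithms, can be checked numerically. The following Python sketch is our own illustration; the helper names are hypothetical and the random distributions are only test inputs.

```python
import math
import random


def l1_distance(p, q) -> float:
    """||P - Q|| = sum_y |P(y) - Q(y)|."""
    return sum(abs(a - b) for a, b in zip(p, q))


def kl_divergence(p, q) -> float:
    """D(P||Q) in nats, with 0 log 0 = 0; infinite if P(y) > 0 = Q(y)."""
    total = 0.0
    for a, b in zip(p, q):
        if a > 0:
            if b == 0:
                return math.inf
            total += a * math.log(a / b)
    return total


def random_distribution(n: int, rng: random.Random):
    """A random probability vector of length n (test input only)."""
    weights = [rng.random() for _ in range(n)]
    total = sum(weights)
    return [w / total for w in weights]


def pinsker_gap(p, q) -> float:
    """sqrt(2 D(P||Q)) - ||P - Q||; Pinsker's inequality says this is >= 0."""
    # max(..., 0.0) guards against tiny negative rounding errors when P ~ Q
    return math.sqrt(2.0 * max(kl_divergence(p, q), 0.0)) - l1_distance(p, q)
```

A nonnegative `pinsker_gap` on every tested pair is consistent with the inequality; it is, of course, not a proof.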

Let $H_{P_t}(x) := -\sum_x P_{tX}(x)\log P_{tX}(x)$, $H_{P_t}(a_t)$, and $H_{P_t}(x, a_t)$ denote the
entropy of the random variables $x$, $a_t$, and $(x, a_t)$, where $(x, a_t)$ has
distribution $P_t$, and $H_{P_t}(x \mid a_t) := H_{P_t}(x, a_t) - H_{P_t}(a_t)$. A straightforward
computation yields $D(P_t \| P_{tX} \otimes P_{tA_t}) = H_{P_t}(x) - H_{P_t}(x \mid a_t)$. Therefore,

$$\|P_t - P_{tX} \otimes P_{tA_t}\| \le \sqrt{2}\sqrt{H_{P_t}(x) - H_{P_t}(x \mid a_t)}. \tag{6}$$
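The "straightforward computation" behind (6) is the identity between the divergence of a joint distribution from the product of its marginals and the entropy drop $H(x) - H(x \mid a)$. The sketch below is a numerical check of that identity on a small joint table, written by us for illustration; `joint[x][a]` is an assumed layout for $P_t(x, a)$.

```python
import math


def entropy(probs) -> float:
    """Entropy in nats, with 0 log 0 = 0."""
    return -sum(p * math.log(p) for p in probs if p > 0)


def marginals(joint):
    """joint[x][a] = P(x, a). Returns the marginals (P_X, P_A)."""
    p_x = [sum(row) for row in joint]
    p_a = [sum(col) for col in zip(*joint)]
    return p_x, p_a


def kl_to_product(joint) -> float:
    """D(P || P_X (x) P_A) for the joint distribution P."""
    p_x, p_a = marginals(joint)
    return sum(p * math.log(p / (p_x[i] * p_a[j]))
               for i, row in enumerate(joint)
               for j, p in enumerate(row) if p > 0)


def entropy_drop(joint) -> float:
    """H(x) - H(x | a), where H(x | a) := H(x, a) - H(a)."""
    p_x, p_a = marginals(joint)
    h_joint = entropy([p for row in joint for p in row])
    return entropy(p_x) - (h_joint - entropy(p_a))
```

For a product distribution both quantities vanish, which corresponds to a stage in which the signal $a_t$ carries no information about $x$ and the posterior does not move.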

Note that $P_t$ is a random variable, which is a function of $a_0, \ldots, a_{t-1}$, and
therefore, by the properties of conditional entropy,
$E_P H_{P_t}(x) = H_P(x \mid a_0, \ldots, a_{t-1})$ (where $E_P$ denotes the expectation with respect to the
probability distribution $P$) and $E_P H_{P_t}(x \mid a_t) = H_P(x \mid a_0, \ldots, a_{t-1}, a_t)$. Therefore,

$$E_P\left(H_{P_t}(x) - H_{P_t}(x \mid a_t)\right) = H_P(x \mid a_0, \ldots, a_{t-1}) - H_P(x \mid a_0, \ldots, a_{t-1}, a_t).$$

As the square root is a concave function we have, by Jensen's inequality,

$$E_P\|P_t - P_{tX} \otimes P_{tA_t}\| \le \sqrt{2}\sqrt{H_P(x \mid a_0, \ldots, a_{t-1}) - H_P(x \mid a_0, \ldots, a_{t-1}, a_t)}.$$

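As $E(\|p_t - p_{t-1}\| \mid a_0, \ldots, a_{t-1})$ equals
$\sum_{a \in A_t} P_{tA_t}(a) \sum_x \left|\frac{P_t(x,a)}{P_{tA_t}(a)} - P_{tX}(x)\right| = \sum_{a \in A_t} \sum_x |P_t(x,a) - P_{tA_t}(a)P_{tX}(x)| = \|P_t - P_{tX} \otimes P_{tA_t}\|$,
we deduce that $E_P\|p_t - p_{t-1}\| = E_P\|P_t - P_{tX} \otimes P_{tA_t}\|$ and therefore by substituting
$E_P\|p_t - p_{t-1}\|$ for $E_P\|P_t - P_{tX} \otimes P_{tA_t}\|$ we get

$$E_P\|p_t - p_{t-1}\| \le \sqrt{2}\sqrt{H_P(x \mid a_0, \ldots, a_{t-1}) - H_P(x \mid a_0, \ldots, a_{t-1}, a_t)}.$$

As the square root is a concave function, using Jensen's inequality and
the equality and inequality
$\sum_{t=1}^k \left(H(x \mid a_0, \ldots, a_{t-1}) - H(x \mid a_0, \ldots, a_t)\right) = H(x) - H(x \mid a_0, \ldots, a_k) \le H(x)$, we have

$$\sum_{t=1}^k E_P\|p_t - p_{t-1}\| \le \sqrt{2}\sum_{t=1}^k \sqrt{H(x \mid a_0, \ldots, a_{t-1}) - H(x \mid a_0, \ldots, a_t)} \le \sqrt{2k}\sqrt{\sum_{t=1}^k \left(H(x \mid a_0, \ldots, a_{t-1}) - H(x \mid a_0, \ldots, a_t)\right)} \le \sqrt{2kH(x)} = \sqrt{2kH(p)}.$$

This completes the proof of Theorem 1. □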

Proof of Theorem 2. Note that $V(k,d)$ is monotonic increasing in $d$ and
$k$, and there is a positive constant $C_1 > 0$ such that $V(k,2) \ge C_1\sqrt{k}$.

If $p_0^{k_1}$ and $q_0^{k_2}$ are two martingales with total variation $V_1$ and $V_2$,
respectively, then $p_0 \otimes q_0, \ldots, p_{k_1} \otimes q_0$ is a martingale with total variation $V_1$ and
$p_{k_1} \otimes q_0, p_{k_1} \otimes q_1, \ldots, p_{k_1} \otimes q_{k_2}$ is a martingale with total variation $V_2$ and
therefore $p_0 \otimes q_0, \ldots, p_{k_1} \otimes q_0, p_{k_1} \otimes q_1, \ldots, p_{k_1} \otimes q_{k_2}$ is a martingale with total
variation $V_1 + V_2$. Therefore,

$$V(k_1,p) + V(k_2,q) \le V(k_1 + k_2, p \otimes q), \tag{7}$$

from which it follows that

$$V(k_1,d_1) + V(k_2,d_2) \le V(k_1 + k_2, d_1 d_2). \tag{8}$$

Inequality (8) implies that if $k$ is a multiple of $\ell$ we have
$V(k, 2^\ell) \ge \ell V(k/\ell, 2) \ge \ell C_1\sqrt{k/\ell} = C_1\sqrt{k\ell}$. Note that for every $k$ and $2 \le d \le 2^k$
there is $k \ge k_1 \ge k/2$ that is a multiple of $\ell = [\log_2 d] \ge (\log_2 d)/2$ (where
$[x]$ is the largest integer $\le x$), and therefore
$V(k,d) \ge V(k_1, 2^\ell) \ge C_1\sqrt{k_1\ell} \ge \frac{C_1}{2}\sqrt{k\log_2 d}$. This completes the proof of Theorem 2. □
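The arithmetic step in this proof, the existence of a multiple $k_1$ of $\ell = [\log_2 d]$ with $k/2 \le k_1 \le k$, can be made explicit. The sketch below uses one natural choice, $k_1 = \ell \lfloor k/\ell \rfloor$; this choice is our assumption for illustration (the proof only asserts existence), and the function names are hypothetical.

```python
import math


def floor_log2(d: int) -> int:
    """[log_2 d]: the largest integer ell with 2**ell <= d (exact on ints)."""
    return d.bit_length() - 1


def choose_block_lengths(k: int, d: int):
    """Return (ell, k1) with ell = [log_2 d] and k1 a multiple of ell,
    k/2 <= k1 <= k. Requires 2 <= d <= 2**k, as in the proof."""
    if not (2 <= d <= 2 ** k):
        raise ValueError("need 2 <= d <= 2**k")
    ell = floor_log2(d)          # ell <= k because d <= 2**k
    k1 = ell * (k // ell)        # largest multiple of ell not exceeding k
    return ell, k1
```

With this choice $k_1 > k - \ell$ and $k_1 \ge \ell$, hence $2k_1 > k$; together with $\ell \ge (\log_2 d)/2$ this gives $\sqrt{k_1\ell} \ge \frac{1}{2}\sqrt{k\log_2 d}$, which is the last inequality of the proof.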

Proof of Theorem 3. Given two repeated games with incomplete information
on one side, $\Gamma^1 = (X_1, p_1, I_1, J_1, G^1)$ and $\Gamma^2 = (X_2, p_2, I_2, J_2, G^2)$, we
define the game $\Gamma = \Gamma^1 \otimes \Gamma^2$ by

$$\Gamma = (X = X_1 \times X_2,\ p = p_1 \otimes p_2,\ I = I_1 \times I_2 \times \{1,2\},\ J = J_1 \times J_2,\ G),$$

where for $x = (x_1, x_2)$, $i = (i^1, i^2, b)$, and $j = (j^1, j^2) \in J$,

$$G^x_{i,j} = G^{x_b}_{i^b, j^b},$$

where $G^{x_b}$ stands for the more explicit $G^{b,x_b}$. Note that $\|G\| = \max(\|G^1\|, \|G^2\|)$.

A possible helpful interpretation of $\Gamma$ is that nature chooses a pair $x_1 \in X_1$
and $x_2 \in X_2$, equivalently a pair of games $G^{x_1}$ and $G^{x_2}$, according to
the product probability $p_1 \otimes p_2$. P1 is informed of the choice $(G^{x_1}, G^{x_2})$ of
nature, but P2 is not. In each stage of the repeated game, both players select
strategies for the first and for the second game, and P1 chooses in addition
which one of the two games determines the stage payoff.

As a function of $i = (i^1, i^2, b)$, for each fixed $b = 1, 2$, the payoff function
$G^x_{i,j}$ does not depend on the coordinate $i^c$ for $c \ne b$. Therefore we can replace
the set $I$ (which has $2|I_1||I_2|$ elements) of stage actions of P1 in the repeated
game $\Gamma$ with the disjoint union of $I_1$ and $I_2$.

Note that if $v^b_k$ and $v_k$ stand for the (normalized) values of the $k$-stage
repeated games $\Gamma^b$ and $\Gamma$, then

$$v_{k_1+k_2} \ge \frac{k_1 v^1_{k_1} + k_2 v^2_{k_2}}{k_1 + k_2}.$$

Indeed, P1 can play $b_t = 1$ in stages $t = 1, \ldots, k_1$ and $b_t = 2$ in stages
$t = k_1 + 1, \ldots, k_1 + k_2$, and the first coordinates $i^1_t$ of $i_t$ follow, in stages
$t = 1, \ldots, k_1$, an optimal strategy of P1 in $\Gamma^1_{k_1}(p_1)$, and the second coordinates
$i^2_t$ of $i_t$ follow, in stages $t = k_1 + 1, \ldots, k_1 + k_2$, an optimal strategy of P1 in
$\Gamma^2_{k_2}(p_2)$.
For $\ell \ge 2$ and a sequence $\Gamma^1 = (X_1, p_1, I_1, J_1, G^1), \ldots, \Gamma^\ell = (X_\ell, p_\ell, I_\ell, J_\ell, G^\ell)$
of RGII-OS, we define by induction on $\ell$ the game $\Gamma = \otimes_{b=1}^\ell \Gamma^b$ by
$\Gamma = (\otimes_{b=1}^{\ell-1} \Gamma^b) \otimes \Gamma^\ell$.

If $v^b_k$, respectively $v_k$, denotes the normalized value of the $k$-stage repeated
game $\Gamma^b_k(p_b)$, respectively $\Gamma_k(\otimes_{b=1}^\ell p_b)$, and $k = k_1 + \ldots + k_\ell$, then

$$v_k \ge \frac{\sum_{b=1}^\ell k_b v^b_{k_b}}{k}.$$

Note that a stage action of P1 in $\Gamma$ is a list of stage actions $i^1, \ldots, i^\ell$ (with
$i^b \in I_b$) and a number $b$ (with $1 \le b \le \ell$). However, given $b$, the payoff
depends only on the coordinate $i^b$ of the stage actions. Therefore we can
replace the stage actions of P1 in $\Gamma$ with the disjoint union of the action sets
$I_b$, and so with a set of size $\sum_b |I_b|$.

Consider the example of the RGII-OS $\Gamma^z = (X = \{0,1\}, (1/2, 1/2), I, J, G)$,
introduced by Zamir [5, Section 3]. The set of states is $X = \{0,1\}$, and players'
action sets are $I = \{0,1\}$ for P1, and $J = \{0,1\}$ for P2. The two payoff
matrices are $G^0$ and $G^1$: $G^0_{0,0} = 3$, $G^0_{0,1} = -1$, $G^1_{0,0} = 2 = -G^1_{0,1}$, and
$G^x_{1,j} = -G^x_{0,j}$. Let $v^z_k$ denote the normalized value of the $k$-stage repeated
game $\Gamma^z$. Zamir [5] shows that $\lim_n v^z_n = 0$ and $v^z_k \ge C_1/\sqrt{k}$, where $C_1 > 0$
is a positive constant.
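For this example the function $u$ can be computed directly. The Python sketch below is our own illustration and rests on an assumption: the payoff matrices are garbled in the available text, and we read them as $G^0 = \begin{pmatrix} 3 & -1 \\ -3 & 1 \end{pmatrix}$ and $G^1 = \begin{pmatrix} 2 & -2 \\ -2 & 2 \end{pmatrix}$. Under that reading, the value of the average game $qG^0 + (1-q)G^1$ is 0 for every $q \in [0,1]$, so $\mathrm{cav}\,u \equiv 0$.

```python
from fractions import Fraction

# Assumed reading of the payoff matrices (rows: actions of P1, columns: P2).
G0 = ((3, -1), (-3, 1))
G1 = ((2, -2), (-2, 2))


def value_2x2(m):
    """Minmax value of a 2x2 zero-sum matrix game (row player maximizes)."""
    (a, b), (c, d) = m
    maximin = max(min(a, b), min(c, d))
    minimax = min(max(a, c), max(b, d))
    if maximin == minimax:               # pure saddle point
        return maximin
    # no saddle point: both players mix; standard closed form
    return (a * d - b * c) / (a + d - b - c)


def average_game(q):
    """The matrix q*G0 + (1-q)*G1, where q is the probability of state 0."""
    return tuple(tuple(q * G0[i][j] + (1 - q) * G1[i][j] for j in range(2))
                 for i in range(2))


def u(q):
    """u(q): value of the one-shot game in which nobody knows the state."""
    return value_2x2(average_game(q))
```

The reason is visible in the matrices: the second row is the negative of the first, so P1 guarantees 0 by mixing $(1/2, 1/2)$, and P2 can equalize the two columns.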

Consider the RGII-OS $\Gamma = \otimes_{b=1}^\ell \Gamma^z$, and let $v_k$ denote the normalized
value of the $k$-stage repeated game $\Gamma$. It follows that

$$v_k \ge \max\left\{\sum_{b=1}^\ell \frac{k_b v^z_{k_b}}{k} : k_b \ge 0 \text{ and } \sum_{b=1}^\ell k_b = k\right\}$$

and therefore if $k$ is a multiple of $\ell$ we can take $k_b = k/\ell$ and therefore

$$v_k \ge v^z_{k/\ell} \ge C_1\sqrt{\ell/k}.$$

For an arbitrary $k$ and $d \le 2^k$, there is $k \ge k_1 \ge k/2$ that is a multiple
of $\ell := [\log_2 d]$ ($\ge (\log_2 d)/2$). As P1 can play $(1/2, 1/2)$ in the last $k - k_1$
stages, $kv_k \ge k_1 v_{k_1}$, and thus $kv_k \ge C_1\sqrt{k_1\ell}$. Therefore,
$v_k \ge \frac{C_1}{2}\sqrt{\log d}/\sqrt{k}$.

Finally, the existence of an optimal strategy of P2 in the infinitely
repeated game $\Gamma^z$ (or a direct computation of the function $u(p)$, the minmax
value of the game $\sum_x p(x)G^x$, for the game $\Gamma$) yields $\lim_n v_n = 0$. Note that
the stage payoffs of the RGII-OS $\Gamma$ are bounded by 3 (independent of the
number of factors $\ell$). Altogether, we have constructed for each $k$ and $d \le 2^k$
a repeated game $\Gamma = (X, p, I, J, G)$ with $|X| \le d$, equivalently $|X| = d$,
($|I| \le 2\log_2 d$) and $\|G\| = 3$, and

$$v_k - \lim_{n \to \infty} v_n \ge \frac{C_1}{2}\sqrt{\log d}/\sqrt{k}.$$

This completes the proof of Theorem 3. □



4 Remarks 

4.1 Comments on the proof of Theorem 1. 

Our proof of Theorem 1 relies on Pinsker's inequality, and it uses
information-theoretic tools. In fact, the information-theoretic intuition has led us to
the result and its proof. However, readers unfamiliar with the
information-theoretic concepts may find the proof obscure. The following is an alternative
derivation (which disguises the use of the information-theoretic techniques)
and uses classical martingale theory techniques. First, note that Pinsker's
inequality implies that if $Z$ and $Y$ are two nonnegative random variables with
$EZ = EY$, then $E|Z - Y| \le \sqrt{2EZ}\sqrt{E(Z\log Z) - E(Z\log Y)}$. In particular, if
$E(Z \mid Y) = Y$ (e.g., when $Y$ is the constant random variable $Y = EZ$), then
$E(Z\log Y) = E(E(Z\log Y \mid Y)) = E(Y\log Y)$, and therefore

$$E|Z - Y| \le \sqrt{2EZ}\sqrt{E(Z\log Z) - E(Z\log Y)} = \sqrt{2EZ}\sqrt{E(Z\log Z) - E(Y\log Y)}. \tag{9}$$

Inequality (9), which is equivalent to Pinsker's inequality, can obviously be
proved directly (see footnote 3). The continuation of the proof avoids the (explicit) use of
information-theoretic techniques.
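Inequality (9) can be checked on a finite probability space, where $Y = E(Z \mid Y)$ is obtained by averaging $Z$ over the blocks of a partition. The Python sketch below is our own numerical illustration; the representation of the probability space by `weights`, `z`, and `blocks` is an assumption made for the example.

```python
import math


def xlogx(v: float) -> float:
    """v log v with the convention 0 log 0 = 0."""
    return v * math.log(v) if v > 0 else 0.0


def both_sides_of_9(weights, z, blocks):
    """Return (E|Z - Y|, sqrt(2 EZ) * sqrt(E Z log Z - E Y log Y)).

    weights: probabilities of the atoms of a finite probability space
    z:       values of the nonnegative random variable Z on the atoms
    blocks:  a partition of the atom indices; Y is the average of Z on
             each block, so that E(Z | Y) = Y
    """
    y = [0.0] * len(z)
    for block in blocks:
        mass = sum(weights[i] for i in block)
        mean = sum(weights[i] * z[i] for i in block) / mass
        for i in block:
            y[i] = mean
    ez = sum(w * v for w, v in zip(weights, z))
    lhs = sum(w * abs(a - b) for w, a, b in zip(weights, z, y))
    inner = sum(w * (xlogx(a) - xlogx(b)) for w, a, b in zip(weights, z, y))
    rhs = math.sqrt(2.0 * ez) * math.sqrt(max(inner, 0.0))
    return lhs, rhs
```

For example, with four equally likely atoms, $Z = (0, 2, 1, 3)$ and blocks $\{0,1\}$, $\{2,3\}$, the left-hand side is 1 and the right-hand side is $\sqrt{3}\sqrt{(3\log 3 - 2\log 2)/4}$, which is larger.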

It follows from (9) that if $Z_0, \ldots, Z_k$ is a martingale of nonnegative random
variables, then

$$\sum_{t=1}^k E|Z_t - Z_{t-1}| \le \sqrt{2EZ_0}\sum_{t=1}^k \sqrt{E(Z_t\log Z_t) - E(Z_{t-1}\log Z_{t-1})},$$

which by Jensen's inequality, the concavity of the square root, and the
telescopic feature of the series $E(Z_t\log Z_t) - E(Z_{t-1}\log Z_{t-1})$, is

$$\le \sqrt{2kEZ_0}\sqrt{E(Z_k\log Z_k) - E(Z_0\log Z_0)}.$$

3 I wish to thank Stanislaw Kwapien for suggesting a proof that avoids the
information-theoretic techniques, and communicating a simple analytical proof of the above displayed
version of Pinsker's inequality.
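Therefore, if $p_0, \ldots, p_k$ is a martingale with values in the nonnegative elements of $\ell_1(X)$, we have

$$\sum_{t=1}^k E\|p_t - p_{t-1}\|_1 \le \sum_{x \in X} \sqrt{2kEp_0(x)}\sqrt{E(p_k(x)\log p_k(x)) - E(p_0(x)\log p_0(x))}.$$

By the Schwarz inequality we obtain that

$$\sum_{t=1}^k E\|p_t - p_{t-1}\|_1 \le \sqrt{2kE\sum_{x \in X} p_0(x)}\sqrt{E\sum_{x \in X} \left(p_k(x)\log p_k(x) - p_0(x)\log p_0(x)\right)}.$$

If $p_k(x) \le M(x)$, then by the convexity of $q\log q$ we have
$E(p_k(x)\log p_k(x)) \le E(p_0(x)\log M(x))$, and then

$$\sum_{t=1}^k E\|p_t - p_{t-1}\|_1 \le \sqrt{2kE\sum_{x \in X} p_0(x)}\sqrt{\sum_{x \in X} -E\left(p_0(x)\log(p_0(x)/M(x))\right)}.$$

If $p_k(x) \le 1$, then $p_k(x)\log p_k(x) \le 0$, and therefore

$$\sum_{t=1}^k E\|p_t - p_{t-1}\|_1 \le \sqrt{2kE\sum_{x \in X} p_0(x)}\sqrt{\sum_{x \in X} -E\left(p_0(x)\log p_0(x)\right)}.$$

We conclude that if $\sum_x p_0(x) = 1$, then $\sum_{t=1}^k E\|p_t - p_{t-1}\|_1 \le \sqrt{2k}\sqrt{EH(p_0)}$.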

4.2 The variation of a martingale of probabilities over a countable set.

It is of interest to find a necessary and sufficient condition for a distribution
$p$ on a countable set $X$ for $\sup_k \frac{1}{\sqrt{k}}V(k,p) < \infty$. We remark here on the
sufficient conditions derived from the classical method and our method of
bounding the variation of martingales of probabilities.

The classical bound of the variation of a martingale $p_0^k$ is obtained by
bounding, for each fixed $x \in X$, the expectation variation $E\|y(x)\|_1$, where
$y(x) \in \mathbb{R}^k$ is the vector of martingale differences
$(p_1(x) - p_0(x), \ldots, p_k(x) - p_{k-1}(x))$ (thus $\|y(x)\|_1 = \sum_{t=1}^k |p_t(x) - p_{t-1}(x)|$), and
summing over all $x \in X$. Assuming without loss of generality that $p_0$ is a constant $p \in \Delta(X)$, we
have (by the Cauchy-Schwarz inequality) $\|y(x)\|_1 \le \sqrt{k}\|y(x)\|_2$, and, therefore,
by Jensen's inequality, $E\|y(x)\|_1 \le \sqrt{k}\sqrt{E\|y(x)\|_2^2}$, which by the martingale
property is $\le \sqrt{k}\sqrt{E((p_k(x))^2 - (p_0(x))^2)} \le \sqrt{k}\sqrt{E((p_k(x)) - (p_0(x))^2)} = \sqrt{k}\sqrt{p_0(x) - (p_0(x))^2}$.
Therefore, if $p \in \Delta(X)$ and $X$ is countable, the classical
method yields that $\sup_k \frac{1}{\sqrt{k}}V(k,p) < \infty$ whenever $\sum_x \sqrt{p(x)} < \infty$. As
$-q\log q = o(\sqrt{q})$ as $q \to 0+$, the condition $\sum_x \sqrt{p(x)} < \infty$ implies that
$H(p) = -\sum_x p(x)\log p(x) < \infty$. Obviously, there are probabilities $p$ over
a countable set $X$ such that $H(p) < \infty$ but $\sum_x \sqrt{p(x)} = \infty$. Therefore
our bound provides a strictly sharper sufficient condition, $H(p) < \infty$, for
$\sup_k \frac{1}{\sqrt{k}}V(k,p) < \infty$, compared to the one derived by using the classical
method.


4.3 The asymptotic behavior of V(k,d). 

The asymptotic behavior of $V(k,d)$ deserves further study. [1] proves that
$V(k,2)/\sqrt{k}$ converges as $k \to \infty$ to $\sqrt{2/\pi}$. It is of interest to find a corresponding
limit theorem for $V(k,d)/\sqrt{k\log d}$ as $2^k \ge d \to \infty$. The above-mentioned
result of [1] together with our construction in the proof of Theorem 2 yields
that the liminf of $V(k,d)/\sqrt{k\log_2 d}$ is $\ge \sqrt{2/\pi}$ as $\frac{\log d}{k} + 1/d \to 0$ (namely, as
$\log d = o(k)$ and $d \to \infty$).



4.4 Repeated games with incomplete information. 

The proof of Theorem 3 constructs for each $d \le 2^k$ a RGII-OS $\Gamma = (X, p, I, J, G)$
with $|X| = d$ and $v_k \ge \lim_n v_n + C\sqrt{k^{-1}\log d}$, where $0 < C = O(\|G\|)$, and,
in addition, $|I| = O(\log d)$ and $|J| = O(d)$. We have not tried to minimize
the order of magnitude of the number of elements of $I$ and $J$. It is however
impossible to construct such an example with bounded $|I|$ and $|J|$. Indeed,
in a forthcoming note we will show that for every RGII-OS $\Gamma = (X, p, I, J, G)$
we have v k < lim n t> n + \\G\\*^ ^ J ^ logk , where ||G||* := 2E p (maXijG^ - 
minj j Gfj). Therefore the inequality v k > lim n t> n + C^/k^ 1 log d is possible 
only if \I x J| > C%&. 